A study on the distribution of Airbnb Listings in Singapore, and how has COVID-19 impacted them.
Singapore is one of the few countries that does not legalise short-term rentals for properties, a minimum stay of three months is still required for rental of properties as of 2019. Therefore, it was surprising to discover data sets for Airbnb listings located in Singapore in Inside Airbnb. But we shall dive deeper into this phenomenon of Airbnb being allowed in Singapore perhaps another day.
For this particular exercise, we will utilising those data sets to generate useful insights through:
For analysis purposes, the following data are used:
Airbnb listings for June 2019 and June 2021 from Inside Airbnb
Various geospatial data sets of choice extracted from SLA OneMap Service, using OneMap.Sg API
Point location data of Mass Rapid Transit (MRT) Stations in Singapore from Land Transport Authority Datamall
To begin the study, R packages will be used for efficiency and a more comprehensive analysis, such as tidyverse and sf etc.
packages <- c("readr", "maptools", "sf", "raster", "spatstat", "tmap", "tidyverse",
"plotly", "ggthemes")
for (p in packages) {
if (!require(p, character.only = T)) {
install.packages(p)
}
library(p, character.only = T)
}
We will first import in data using the Airbnb (2019 and 2021) datasets, as we are interested in analysing the distribution of Airbnb listings Pre and Post COVID-19. For Pre-COVID context, the analysis aims to uncover whether the distribution of Airbnb listings are affected by location factors such as located near hotels or tourist attractions, so those datasets are imported in as well.
airbnb_2019 <- read_csv("data/the2data/airbnb2019.csv")
airbnb_2021 <- read_csv("data/the2data/airbnb2021.csv")
sg_hotels <- read_csv("data/the2data/OneMapData/hotels.csv")
tourist_attractions <- read_csv("data/the2data/OneMapData/tourism.csv")
We will also utilise an additional dataset from SLA OneMap, which is the Hawker Centres dataset. Singapore’s street food or known locally as hawker food culture has helped put the tiny country on the world map, with the hawker culture being featured in international food documentary series such as Netflix’s Street Food: Asia and also recently being recognised as a UNESCO heritage representative of Singapore.
Therefore, it would be interesting to see if the location of Airbnb listings would be located near more famous Hawker Centres, such as Maxwell Road Hawker Centre and Newton Food Centre etc.
The dataset will be imported using the One Map Api as shown in the below code chunk.
library(onemapsgapi)
hawker_centres <- get_theme("eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJzdWIiOjc5NTgsInVzZXJfaWQiOjc5NTgsImVtYWlsIjoianVucGVuZy50ZW8uMjAxOUBzbXUuZWR1LnNnIiwiZm9yZXZlciI6ZmFsc2UsImlzcyI6Imh0dHA6XC9cL29tMi5kZmUub25lbWFwLnNnXC9hcGlcL3YyXC91c2VyXC9zZXNzaW9uIiwiaWF0IjoxNjMxNzcyNzMzLCJleHAiOjE2MzIyMDQ3MzMsIm5iZiI6MTYzMTc3MjczMywianRpIjoiODc4OGY3NGI4Yjk5ZDZmZTI1NWFmNjExYjlkNTZiYjQifQ.N-KrWjJPHLJIQVwpBLjyDg6s0JWVNESINp7-7YjUEQI",
"hawkercentre")
The coordinates in the respective data frames were defined in values of Latitude and Longitude, therefore being congruent with EPSG 4326. However, Singapore’s national Project Coordination Systems should be EPSG 3414 (SVY21) instead, therefore we will transform the crs to EPSG 3414.
airbnb_2019 <- st_as_sf(airbnb_2019, coords = c("longitude", "latitude"), crs = 4326) %>%
st_transform(crs = 3414)
airbnb_2021 <- st_as_sf(airbnb_2021, coords = c("longitude", "latitude"), crs = 4326) %>%
st_transform(crs = 3414)
sg_hotels <- st_as_sf(sg_hotels, coords = c("Lng", "Lat"), crs = 4326) %>%
st_transform(crs = 3414)
hawker_centres <- st_as_sf(hawker_centres, coords = c("Lng", "Lat"), crs = 4326) %>%
st_transform(crs = 3414)
For tourist attractions, an extra step is needed because there are NA values, so we have to drop them. If not, we will get an error “missing values in coordinates not allowed”.
Checking and ensuring Projected Coordinates Systems of the sf objects is correct
st_crs(airbnb_2019)
st_crs(airbnb_2021)
st_crs(sg_hotels)
st_crs(hawker_centres)
st_crs(tourist_attractions)
Importing shapefile of MRT Stations locations using st_read()* of sf package. The output object will be in tibble sf object class.
We will be looking at the correlation between location of Airbnb listings and MRT stations in Singapore.
mrt_sf <- st_read(dsn = "data/the2data/MRT", layer = "MRTLRTStnPtt")
Reading layer `MRTLRTStnPtt' from data source
`C:\teojp3\IS415_blog\data\the2data\MRT' using driver `ESRI Shapefile'
Simple feature collection with 171 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 6138.311 ymin: 27555.06 xmax: 45254.86 ymax: 47854.2
Projected CRS: SVY21
The output of the above code chunk showed that projection of mrt_sf is in SVY21, so we are good to proceed on because it is the correct projected coordinates system.
The code chunk below uses as_Spatial() of sf package to convert the respective geospatial data from simple feature data frame to sp’s Spatial* class.
airbnb_2019_sp <- as_Spatial(airbnb_2019)
airbnb_2021_sp <- as_Spatial(airbnb_2021)
hotels_sp <- as_Spatial(sg_hotels)
hawker_sp <- as_Spatial(hawker_centres)
tourist_attractions_sp <- as_Spatial(tourist_attractions)
mrt_sp <- as_Spatial(mrt_sf)
Taking a look at a chosen few of the above data frames to ensure that they are converted to Spatial* classes successfully.
airbnb_2019_sp
class : SpatialPointsDataFrame
features : 8293
extent : 7215.566, 44098.31, 25166.35, 49226.35 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
variables : 14
names : id, name, host_id, host_name, neighbourhood_group, neighbourhood, room_type, price, minimum_nights, number_of_reviews, last_review, reviews_per_month, calculated_host_listings_count, availability_365
min values : 49091, -, 23666, (Email hidden by Airbnb), Central Region, Ang Mo Kio, Entire home/apt, 0, 1, 0, 15656, 0.01, 1, 0
max values : 36053005, ZR2- NEW! Sunny & Modern Apt 4 mins to Orchard Rd, 271165196, Zuzana, West Region, Yishun, Shared room, 13999, 1000, 308, 18072, 12.09, 277, 365
hotels_sp
class : SpatialPointsDataFrame
features : 422
extent : 5939.241, 45334.18, 25379.44, 44562.4 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
variables : 7
names : NAME, ADDRESSPOSTALCODE, ADDRESSSTREETNAME, HYPERLINK, TOTALROOMS, KEEPERNAME, ICON_NAME
min values : 30 BENCOOLEN, 18956, 1 Bayfront Avenue, 96ytlim@gmail.com, 4, Adel Aramouni, hotel.gif
max values : YotelAir Singapore Changi Airport, 819666, 99 IRRAWADDY ROAD, # 22-00 ROYAL SQUARE AT NOVENA, zubair@dam.com.sg, 2561, Zhang YuanQing, hotel.gif
This conversion is required because spatstat only takes in analytical data in ppp object form.
The below code chunks converts the Spatial* classes into generic sp objects.
airbnb_2019_sp <- as(airbnb_2019_sp, "SpatialPoints")
airbnb_2021_sp <- as(airbnb_2021_sp, "SpatialPoints")
hotels_sp <- as(hotels_sp, "SpatialPoints")
hawker_sp <- as(hawker_sp, "SpatialPoints")
tourist_attractions_sp <- as(tourist_attractions_sp, "SpatialPoints")
mrt_sp <- as(mrt_sp, "SpatialPoints")
Taking a look at a chosen few of the above Spatial* classes to ensure that they are converted to generic sp objects successfully.
hawker_sp
class : SpatialPoints
features : 125
extent : 12874.19, 45241.4, 28355.97, 47872.53 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +ellps=WGS84 +towgs84=0,0,0,0,0,0,0 +units=m +no_defs
mrt_sp
class : SpatialPoints
features : 171
extent : 6138.311, 45254.86, 27555.06, 47854.2 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=m +no_defs
In this step, we will use as.ppp() function of spatstat to convert the spatial data into spatstat’s ppp object format.
airbnb_2019_ppp <- as(airbnb_2019_sp, "ppp")
airbnb_2021_ppp <- as(airbnb_2021_sp, "ppp")
hotels_ppp <- as(hotels_sp, "ppp")
hawker_ppp <- as(hawker_sp, "ppp")
tourist_attractions_ppp <- as(tourist_attractions_sp, "ppp")
mrt_ppp <- as(mrt_sp, "ppp")
plot(airbnb_2019_ppp)

summary(airbnb_2019_ppp)
Planar point pattern: 8293 points
Average intensity 9.345289e-06 points per square unit
*Pattern contains duplicated points*
Coordinates are given to 3 decimal places
i.e. rounded to the nearest multiple of 0.001 units
Window: rectangle = [7215.57, 44098.31] x [25166.35, 49226.35] units
(36880 x 24060 units)
Window area = 887399000 square units
Duplicated points are commonly found in ppp objects as shown in the output above, but the presence of duplicates will affect the accuracy of the spatial point patterns analysis, therefore we have to handle these duplicates in this section.
We will use the jittering approach, which adds a small perturbation to the duplicate points so they would not occupy the exact same space and point as their originals.
airbnb2019_ppp_jit <- rjitter(airbnb_2019_ppp, retry = TRUE, nsim = 1, drop = TRUE)
airbnb2021_ppp_jit <- rjitter(airbnb_2021_ppp, retry = TRUE, nsim = 1, drop = TRUE)
hawker_ppp_jit <- rjitter(hawker_ppp, retry = TRUE, nsim = 1, drop = TRUE)
tourism_ppp_jit <- rjitter(tourist_attractions_ppp, retry = TRUE, nsim = 1, drop = TRUE)
hotels_ppp_jit <- rjitter(hotels_ppp, retry = TRUE, nsim = 1, drop = TRUE)
mrt_ppp_jit <- rjitter(mrt_ppp, retry = TRUE, nsim = 1, drop = TRUE)
any(duplicated(airbnb2019_ppp_jit))
[1] FALSE
When doing spatial point patterns analysis, a good practice to always follow is to confine the analysis with a geographical area such as Singapore’s boundary. In spatstat, this polygonal region is known as owin, an object specially designed to represent this boundary.
sg <- st_read(dsn = "data/the2data/sgdata", layer = "CostalOutline")
Reading layer `CostalOutline' from data source
`C:\teojp3\IS415_blog\data\the2data\sgdata' using driver `ESRI Shapefile'
Simple feature collection with 60 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 2663.926 ymin: 16357.98 xmax: 56047.79 ymax: 50244.03
Projected CRS: SVY21
sg <- st_transform(sg, crs = 3414)
sg_sp <- as(as_Spatial(sg), "SpatialPolygons")
sg <- st_make_valid(sg)
The code chunk below is used to convert sg SpatialPolygon object into owin object of spatstat.
sg_owin <- as(sg_sp, "owin")
The output object combined both the point and polygon feature in one ppp object class as shown below.
airbnb_2019_ppp = airbnb2019_ppp_jit[sg_owin]
plot(airbnb_2019_ppp)

tmap_mode("plot")
tm_shape(airbnb_2019) + tm_dots(alpha = 0.4, col = "blue", size = 0.05) + tm_shape(sg_hotels) +
tm_dots(alpha = 0.4, col = "red", size = 0.05) + tm_shape(tourist_attractions) +
tm_dots(alpha = 0.4, col = "green", size = 0.05)

First-order Spatial Point Patterns Analysis (SPPA) will be performed using spatstat package.
Kernel Density Maps will be used to unravel spatial patterns between the location of listings and various location factors mentioned earlier on. The advantage of using Kernel Density over Point Density is that the results are much more spatially accurate, because Point Density usually produces more steep edges which is not desirable.
On the other hand, Kernel density gives smoother results that allows for plotting nicer visuals, which is crucial for getting substantial analysis insights.
#### Kernel Density Map of Airbnb Listings
# kde_airbnb2019_bw_original <- density(airbnb_2019_ppp, sigma=bw.diggle,
# edge=TRUE, kernel='gaussian')
# kde_airbnb2019_bw <- density(airbnb2019_ppp_owin, sigma=bw.diggle, edge=TRUE,
# kernel='gaussian') gridded_kde_airbnb2019_bw <-
# as.SpatialGridDataFrame.im(kde_airbnb2019_bw)
# spplot(gridded_kde_airbnb2019_bw, main='Airbnb Listings')
# bw <- bw <- bw.diggle(airbnb_2019_ppp) bw
kde_airbnb2019_bw <- density(airbnb_2019_ppp, sigma = bw.diggle, edge = TRUE, kernel = "gaussian")
plot(kde_airbnb2019_bw, main = "Airbnb Listings")

# Convert units to km
kde_airbnb2019.km <- rescale(airbnb_2019_ppp, 1000, "km")
kde_airbnb2019.bw <- density(kde_airbnb2019.km, sigma = 1, edge = TRUE, kernel = "gaussian")
plot(kde_airbnb2019.bw, main = "Airbnb Listings")

gridded_kde_airbnb2019_bw <- as.SpatialGridDataFrame.im(kde_airbnb2019.bw)
spplot(gridded_kde_airbnb2019_bw, main = "Airbnb Listings")

kde_airbnb_2019_bw_raster <- raster(gridded_kde_airbnb2019_bw)
kde_airbnb_2019_bw_raster
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.4170614, 0.2647348 (x, y)
extent : 2.663926, 56.04779, 16.35798, 50.24403 (xmin, xmax, ymin, ymax)
crs : NA
source : memory
names : v
values : -1.925625e-14, 212.2338 (min, max)
projection(kde_airbnb_2019_bw_raster) <- CRS("+init=EPSG:3414 +datum=WGS84 +units=km")
kde_airbnb_2019_bw_raster
class : RasterLayer
dimensions : 128, 128, 16384 (nrow, ncol, ncell)
resolution : 0.4170614, 0.2647348 (x, y)
extent : 2.663926, 56.04779, 16.35798, 50.24403 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=1.36666666666667 +lon_0=103.833333333333 +k=1 +x_0=28001.642 +y_0=38744.572 +datum=WGS84 +units=km +no_defs
source : memory
names : v
values : -1.925625e-14, 212.2338 (min, max)
tmap_mode("view")
tm_shape(sg) + tm_borders(col = "black", lwd = 1, alpha = 0.5) + tm_shape(kde_airbnb_2019_bw_raster) +
tm_raster("v", alpha = 0.7) + tm_layout(legend.position = c("right", "bottom"),
frame = FALSE, title = "Airbnb Listings 2019") + tm_basemap("OpenStreetMap")
plot(kde_airbnb2019_bw)

tmap_mode("view")
tm_shape(airbnb_2019) + tm_dots(alpha = 0.4, col = "blue", size = 0.05) + tm_shape(sg_hotels) +
tm_dots(alpha = 0.4, col = "red", size = 0.05) + tm_shape(tourist_attractions) +
tm_dots(alpha = 0.4, col = "green", size = 0.05)